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A Method of Maintaining Data 
Consistency in a Loose Transaction Model 

FIELD OF THE INVENTION 

The present invention relates generally to electronic mformation technology and 
more particularly to electronic data management systems. 

BACKGROUND 

Content management systems store, access, and manage digital information or 
content ,„ netted environments, inc.uding such information generated in e-business 
apphcattons. We„ integrated content has many applications, and for examp.e can 
-elerate business process automation across an enterprise for various sized e-businesses 
Relevant content can include electronic documents, text flies, XML and HTML files 
dtgital audio and video files, scamted images, facsimiles, and the like. 

Content management systems, especiaUy those designed for managing distributed 
content, may a!so store meta-data describing an object or related information in a store tha, 
ts separate from a fi.e containing the object's content. DB2 Universa! Database (UDB) is 
an examp,eofasca,ab.eda,abase tha, can be used in content management systems as a 
database management system (DBMS). The DBMS is an essentia, par. of the system for 
stonng d.gital information or content. Meta-data can be stored in a repository of the 
DBMS. Examples of meta-data include information about data sources access 
authonzation, archive and backup histories, dam accesses and content identification .abels 
(e.g., video, audio, text). 

A particular challenge arising in content management systems is maintaining 
constancy between fi.e content and the associated meta-data from the point of view of an 
app .canon accessing the conten, and meta-data (simply referred .o hereinafter as a reader) 

Iff e^dmem-da.aupda.esaretigh.lycoupkda.e.bomupda.eshappenwiminasingie ' 
umfied transaction), a transaction coordinator .ypicaHy ensures a consistent view by 
iockmg out readers of mcta-data as well as fi.e data unti, the transaction is committed 
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Intermediate / uncommitted updates to either are not visible. However, this approach has a 
number of disadvantages including the circumstance that content edits can require 
considerable amounts of time. This approach also does not allow the convenience of 
directly accessing and updating the file content on the native filesystem, using native 
5 filesystem operations. 

On the other hand, in systems where a loose transaction model is provided, and 
direct content edits are allowed, consistency between file-data and meta-data may not be 
guaranteed at all times. 

10 

A need therefore clearly exists for an improved technique for providing a consistent 
view of file data and meta-data in the presence of a loose-transaction model. 

Management of External Data using DATALINKS 
1 5 Content can be referenced within a database using the DATALINK data type, 

which is part of the ANSI ISO standard described in document ISO/TEC 9075-9 :2000(E), 
"Information Technology - Database Languages SQL - Part 9: Management of External 
Data (SQL/MED)". 

20 Datalinks (DL) is a mechanism that can be used with DB2 UDB, facilitating 

management of files residing outside of a database as though the files are logically within 
the database. Datalinks ensures referential integrity of the external files, provides access 
control for the files, and supports automatic and coordinated backup and restore 
capabilities. In this manner, a coordinated administration point is provided for file and 

25 database data. In the following description, a DATALINK type is referred to for SQL 
management of external data. 

Datalinks maintains a reference to a file residing in a file system in a column of a 
DB2 table. The reference is stored as DATALINK data type, which uses a Uniform 
30 Resource Locator (URL). A table in the database may have one or more DATALINK 
columns. The database may also store meta-data about the files with DATALINK 
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references in the same table or other tables. Thus, a row may contain a DATALINK 
reference and meta-data related to the external file. An SQL query or statement issued by 
an application or reader may be applied against the table on the meta-data, for example, to 
locate the (external) file of interest. The URL retrieved from the DATALINK column of 
the row is then used to access the external file via the native API of the file system or 
browser. 

Fig. 1 illustrates the syntax of the DATALINK data type, including options with 
respect to write permission, namely FS and Blocked. A DATALINK value references a 
file that is not part of the SQL environment. The file is assumed to be managed by an 
external file manager. A DATALINK value is represented by a file reference, which is a - 
character string referencing an external file, and is input and retrieved by invoking a 
built-in scalar function. A file is linked to the SQL environment when a SQL insert 
statement causes a value that references the file to appear in a datalink column whose 
descriptor includes the link control FILE LINK CONTROL. If the read permission option 
is DB, access to the referenced data source is . SQL mediated. If the read permission option 
is not DB, access to the referenced file is determined by the file system and file manager. 
NO LINK CONTROL does not cause any file to be linked to the SQL environment. 

Further information regarding the datalink structure shown in Fig. 1 can be found in 
Section 4.8 "Datalinks" at pages 31-35 in the ANSI ISO standard of ISO/TEC 
9075-9:2000(E), "Information Technology - Database Languages SQL - Part 9: 
Management of External Data (SQL/MED)". 

25 DataLinks supports two modes of "WRITE PERMISSION" : FS (File System) or 

BLOCKED. In the FS case, users are allowed to update a file while the file is linked to the 
database (i.e. the database has a DATALINK value which is currently pointing to the file). 
However, this mode does not provide file data recovery, which means if the disk crashes or 
a user needs to restore the database, there is no file backup data to recover from. This 

30 could cause inconsistency between the file data and the database data after RESTORE / 
ROLLFORWORD to a point in time other than the time of the crash. Moreover, WRITE 
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PERMISSION FS does not support the SQL mediated access model like the one provided 
in READ PERMISSION DB. 

On the other hand, WRITE PERMISSION BLOCKED provides data recovery for a 
file with a reference in a DATALINK column. However, a user cannot update the file 
while the file is currently linked. In order to modify the file, the user has to unlink the file 
in one transaction, modify the content and link the file in another transaction. So between 
the two transactions, the file is not linked, which means it is not under the control of the 
database (DataLinks) and the file reference is not visible in the database. 

Fig. 4 is a block diagram of a system 100 for managing links between a database 
80 and external data files 20 referenced in the database 80. The example chosen for 
illustrative purposes is an employee records database, such as might be utilized by a 
corporation. The database 80 records reside within a DBMS 50. The external data files 20 
are located in a native file system 75 and not in the database 80. An example of the file 
system 75 is JFS on ATX™, or NTFS on Windows NT™ products. 

The database 80 stores an employee table 82 including a name column 83, a 
department column 84, and a picture column 81. The name column 83 typically contains a 
string, the department column an integer, while the picture column would contain a 
reference to an image stored in one of a number of external data files 20. The external data 
file 20 may also contain other forms of data, including documents, presentations, 
engineering drawings, spreadsheets, or video clips. 

It would be undesirable for any one of the external data files 20 to get deleted, 
modified, or renamed while that external data file 20 continues to be referenced by the 
database 80, and in particular the picture column 81. A Datalinks File Manager (DLFM) 
60 is therefore provided for maintaining a table (not illustrated) of linked file references. 
The DLFM 60 may also be responsible for controlling access permission and ownership of 
the external data files 20. The table maintained by the DLFM 60 contains attributes and 
subsets of the data stored in the external data files 20 along With lopiral rpfprpnr^c tr, tht> 
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In operation, an application program 30 searches the database 80 via a SQL 
Application Programming Interface (API) request 40 to identify database entries 80 of 
interest. In the example illustrated in Fig. 4, this typically occurs when an employee record 
5 is viewed and there is a requirement for the image of the employee to be displayed with 
his/her other information. 

The DBMS 50 is responsible for managing the links between the database 80 and 
the data files 20 and provides to the application 30 references to the selected external data 
1 0 files 20 via a handle. An example of the DBMS 50 is the DB2™ Universal Database 
product of International Business Machines Corporation. The application 30 can now 
access the external data file 20 directly using standard file-system API calls 70. Typical 
file-system API calls 70 are "file-open", "file-read" and "file-delete". 

15 A Datalinks File System Filter DLFF 10 is a thin, database control layer or filter 

on the file system 75. The DLFF 10 intercepts certain of the file-system API calls 70 
issued by the application 30. The file-system API calls 70 that are intercepted include 
file-open, file-rename, and file-delete calls. If the external data file 20 is referenced in the 
database 80, the DLFF 10 is responsible for enforcing referential-integrity constraints and 

20 access-control requirements defined for the data file 20. The referential integrity 

constraints ensure that the reference to data files 20 remains valid as long as it is "linked" 
to the database 80. DLFF 10 also ensures that any access call 70 meets Database 
Management System (DBMS) 50 access control requirements. An example of SQL 
mediated access is in the use of an authorization token, which is generated by the DBMS as 

25 part of the handle and verified by DLFF. 

The DLFF 10 validates any authorization token (not illustrated) embedded in the 
file pathname for a 'file-open' operation. For example, when a user of application 30 
submits a SQL API Request 40 to retrieve the employee picture (image) from the database 
30 80, the DBMS 50 checks to see if the user has permission to access the employee table 82 
containing the picture column 81. The permission may include select and view privileges 
on the employee table 82. The DBMS 50 returns the file name of the external file 20 to the 
application 30 only if the user has the required permission. An authorization token is 
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DBMS 50 The apphcation 30 then uses the file API ca.l 70 ,o retrieve ,he image from the 

•he nat ve file system 75. If the vahdation fai ,s ; th e file-open request is rejected. Once 
access as been authored by a valid token, the apphcation 30 interacts direc„ y with the 
file system 75 for delivery of the externa, fi,e 20 without the need for the DLFF ,0 to 
^^^^^^^^^^^ 

to a native file system. 



SUMMARY 

taaCCOrdanCewithafe '-pec,ofthei„ve n ,ion,thereisprcvideda m e tt ,od„f 
mamtammg consistency of content of an object and tneta-data related to the object in a 

loosetnnsactionrnode.forobjectandn.eta-dataupdates.Thernethodincludesthesteps 
of stonng the re,a,ed meta-data and a reference to the object in a table of a database the 
object being stored externally to the database in an object store, the reference used to 
obtatn a handle for directly accessmg or manipulating mc extema| objec(; 
verston number embedded in the handle and comparing the embedded version number with 

■f the handle refers to a current version of the externally stored object. 

Preferabiy the method further includes the steps of, if the encoded version number 
and the version number of the ,a.es, committed version match, comparing a las. 
mediation time stamp of the object with a las, modification time stamp for the lates, 
committed version of the object; and if the last modification time stamp of the object 
matches with the las, modification time stamp for the latest committed version of the 
object, permitting access to the extemaUy stored object. The method may further include 
the step „f, if me last modification (ime s(amp of ^ ob . ec( ^ ^ wift ^ ^ 

modtficationtimestampforthelatesteornmittedversionofmeobjcct.generatinganerror 
to mdicate that the handle refers to stale content in the object. 
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Preferably the method further includes the steps of updating the object in-place 
under either DBMS control or file system control and linking the meta-data and the object 
under DBMS control. 

5 Preferably the method further includes wherein the loose-transaction update model 

uses SQL Mediated Object Manipulation (SMOM) for an object that resides external to the 
database. 

Preferably the method further includes the step of intercepting a native access to the 
10 externally stored object or a file system and validating the caller's access rights based on a 
combination of the version number and a last modification time stamp for a version of the 
object. The method may further include the intercepting step being carried out using a 
filter layer of the object store for the stored object. 

15 Optionally, the object store is a local file system, or a distributed file system, the 

object being accessed from a remote file system client. A file access which occurs in the 
presence of authoritative caching and the comparing steps may be performed at the file 
system client. The method may further include the steps of caching the last known version 
number and the corresponding last modification time stamp at the file system client after an 

20 access and refreshing the last known version number and the corresponding last 

modification time stamp with latest values from a file server the next time one or both of 
the comparisons fail with the previously cached values, in which case the comparing steps 
are retried with refreshed values. 

25 Preferably the method further includes the object which is a file. 

Preferably the method further includes the version number associated with the 
object is embedded in an access token. 

30 Preferably the method further includes the version number is temporally unique. 
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Preferably the method further includes the last-modification-time stamp attribute 
associated with the object is maintained by the object store. 

In further aspects of the invention, an apparatus, a computer program, a computer 
5 program product, and a system for maintaining meta-data and file-data consistency in a 
loose transaction model of object and meta-data updates are provided. 

BRIEF DESCRIPTION OF THE DRAWINGS 

A small number of embodiments of the invention are described hereinafter with 
1 0 reference to the drawings, in which: 

Fig. 1 illustrates the syntax for a DATALINK data type when defining an SOL 

table; 

15 Figs. 2A and 2B are block diagrams illustrating a content management system in 

accordance with an embodiment of the invention; 

Figs. 3A, 3B, and 3C are flow diagrams illustrating control processing of the 
content management system of Fig. 2; 

20 

Fig. 4 is a block diagram of the Datalinks architecture showing where a Datalinks 
file-system filter (DLFF) fits in; 

Figs. 5A and 5B are flow charts for checking consistency in a local file system and 
25 a distributed file system, respectively; 

Fig. 6 is a block diagram of the system architecture for a local file system in 
accordance with one embodiment of the invention; and 

30 Fig. 7 is a block diagram of the system architecture for a distributed file system in 

accordance with a second embodiment of the invention. 
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DETAILED DESCRIPTION 

A method, an apparatus, a computer program, a computer program product and a 
system for maintaining (or ensuring) meta-data and object-data consistency in a loose 
5 transaction model of object and meta-data updates, from the perspective of a reader 

application, are described hereinafter. Preferably, the object may be a file and an object 
store may be a file system. In the following description, numerous details are set forth. It 
will be apparent to one skilled in the art, however, that the present invention may be 
practised without these specific details. In other instances, well-known features are not 
10 described in detail so as not to obscure the present invention. 

With the embodiments of the invention, allowing file content to be accessed and 
updated in-place by directly accessing the file system natively is more convenient. , 
Besides, since content edits can be relatively long running,Jja loose transaction model for 



1 5 file and meta-data updates is useful where the file can be edited independently of the 



meta-datij A user can update a file while it is currently linked. In addition^ the meta-data 

for that file can be accessed by other users while the file is in the process of being updated. 
The meta-data is always visible, even while the file is being update. However, access to the 
file may be denied. SQL mediated access can be extended to allow update-in-place. 
20 Specifically, the WRITE PERMISSION clause in Fig. 1 can be extended to achieve this. 

The embodiments of the invention address the noted problem in the context of 
meta-data maintained in a database table associated with an external file reference to 
content that is stored in a file system or an object store external to the database. 

25 

Generally, a file update takes longer than meta-data updates or typical database 
transactions. In a typical update, access to the file is requested and the file is locked. The 
content of the file is modified as well as corresponding meta-data generated. The file is 
then unlocked (i.e., the file update is declared to be completed) and the meta-data is 
30 imported into the database. Changes to the content of the file happen outside a regular 
DB2 transaction. 
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In particular, the problem addressed by the embodiments of the invention is 
ensuring that a reader application always gets a consistent view of the content of a file and 
the meta-data related to the file, given a loose transaction model for file and meta-data 
5 updates. An error can acceptably be returned in situations where the file data being 
accessed is not consistent with the corresponding meta-data seen by the reader. 

Solutions are provided by the embodiments of the invention, firstly for files 
accessed from a local file system 700. This is then extended to a distributed file system 
10 environment 800 for files accessed from remote file system clients in the presence of 
authoritative caching. 

Fig. 6 is a block diagram of a system 700 for use with a local file system in 
accordance with a first embodiment of the invention. An application 710 has a SQL path 

1 5 722 for meta-data access to a meta-data server 720 and a file system API 732 for direct file 
access to a file server 730. The meta-data server 720 has a database management system 
(DBMS) 724, which directly interfaces with the SQL path 722. The DBMS 724 also has 
meta-tables 726 with external file references. The DBMS 724 and Meta-data tables are 
coupled to a store 728 for storing meta-data. The file server 730 has two modes: user 760 

20 and kernel 770. The user mode 760 includes a local repository 734 and a file manager 

daemon 736. A control path 740 extends between the DBMS 724 of the meta-data server 
720 and the file manager daemon 736 of the file server 730. The control path 740 is 
indicated by a double headed dashed arrow. The kernel mode 770 includes a virtual file 
system interface 750, which is bi-directionally coupled to the application 710 via the file 

25 system API 732. In turn, the virtual file system interface 750 is bi-directionally coupled 
with an interceptor/filter module 752. An upcall path 738 also bi-directionally extends 
between the file manager daemon 736 and the interceptor/filter module 752. The 
interceptor/filter module is bi-directionally coupled to the native file system 754 (e.g. UFS, 
JFS, NTS). The native file system 754 is in turn coupled to a store 756 of the file server 

30 760 for storing one or more relevant objects or files. 
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Fig. 7 is a block diagram of a svstem son *x 
810 has a SOL nath 899 fx ♦ , invention. An application 

' -Occupy 

820 is the same as the meta-data server 720 of F' 7 

s- 1JJ3M;> /24 is referred to as DBMS 824 in m a ™ TL 
mode 860 of the client «nt„ , g - 7 ^ The user 

client 830 has a pr0X y daemon 890. The file system API 81? ,• 
b etween the application 81f) fi, • , system API 832 is coupled 

'0 870 of the c,I S3 I ' ^ ^ 850 °^ k ™> ™>* 

" or me client 830. The virtual file system interface 850 a- ■ 

thein.erceptor/fil,ermodule852 A ,, b-CreConally coupled with 

The file server 880 includes a local repository 887 - <-, 
A control path 840 bi-directinn n , 6 daemon 8 84. 

P n 54U bi directionally extends between the DBMS 824 nfth . * 

820 and the fi,e manager daemon 884 of the f„e server 88 ^ ° f ~>* ™ 
20 daemon 884 of the file server 880 w „• f " e 

— 830. The fi e « 1 ' "'T^ " "* «* — — - « of 
^buted file system ™ 7 3 ^ ^ ^ — «" "» 

^ --e^butedfilesylZZr^^ 

732832™^^"°°" 

'0 filename to the filesystem API TW, 7 " S SUPP ' ied 35 "» 

~ta,s U pdatediIIeTl: e I 1^ ^ ^ " 

728, 828. To effect the corresponding meta-data 
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updates and relate those updates to the file updates, the user issues an SQL update through 
the mediator DBMS 724, 824, optionally passing in the object handle, and commits both 
file and meta-data updates together. Another possibility is that the file may be updated 
based on the filesystem permissions. To effect the corresponding meta-data updates and 
5 relate those updates to the file updates, the user issues an SQL update through the mediator 
DBMS 724, 824, and commits both file and meta-data updates together. 

Preferably, a filter layer 752, 852 transparently intercepts native access to the 
external object store or file system 756, 856 and 888, making use of the embodiments of 
10 the invention, to allow or deny access to the object based on a combination of an implicit 
version number and the last modification timestamp for a given version. This is done to 
validate the consistency of the meta-data retrieved by the reader and the content of the file. 

1. The Update Model 

The file edits happen outside of the DB transaction. The file updates get committed 

15 to the database (DB) via the DBMS 724, 824 once the update is done. DBMS 724, 824 
communicates with the file management mediator daemon (also referred to as DLFM in 
the Datalinks product) 736, 884 passing a version number or generation number associated 
with the file, which is temporally unique on every committed update to the file. The file 
management mediator daemon 736, 884 utilizes a local repository (i.e. local to the 

20 fileserver node) 734, 882 to store the version number attribute of the file. (The repository 
734, 882 could be a local database, as is the case of the embodiment implemented as part 
of the DB2 Datalinks product). The file management mediator daemon 736, 884 also 
saves a last modification timestamp attribute of the file maintained natively by the file 
system (or external store) as the last modified time of the latest committed version of the 

25 file. 

This file management mediator daemon 736, 884 utilizes the local repository 734, 
882 (i.e. local to the fileserver node) to store the attributes of these objects at the time of 
their association with the DB, and at the time of committed updates. (Again, the repository 
30 could be a local database, as is the case of the embodiment implemented as part of the DB2 
Datalinks product) 
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The solution provided by the embodiments of the invention extends the 
functional of this filter layer 752, 852 and the associated file management mediate, 
aemon 73o , 840 to ensure ^ ^ ^ ^ ^ 

for fi,e accesses where the file content are no longer consistent with the tneta-data 

"^^.''^^"--^cetheobjec,. Preferably, the error code is 

i auj in the Unix operating system. 



2. Solution 

10 2.1 Observation 
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I. may be observed that the token (embedded in the hand,e used to reference the 
object) becomes obsolete in the following situations: 

Firs, if uncommitted updates are made to the file any time before the token 
» used OR ge, committed any time after the token was generated. Second if 
a token is issued and the database is restored to a prior time before the token 
is used. 

^—ommirtedfileupdatesmayhavehappenedafterthetokenisgenerated or 

ab.ea, the timethe token was generated corresponds to the ,ast committed file data and 
does not reflect uncommitted fi,e updates tha, happened before the token was generated. 

Such uncommitted file updates can happen via a re-unking 0 f the same fiIe (aftef _ 
unlink and a change in content via a rename or a write to the file), or through an SQL 
Update commit for an update-in-plaee. There are two cases here: 
1 ■ Updates committed before the token is used; and 
2. Updates not yet committed before the token is used 
(update-in-progress state). 
This leads us to the solution provided by the embodiments of the invention. 
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2.2 Basic Solution for the Local File system Case 

The solution provided by the embodiments of the invention is based on the following 
constituents: 

1. A version number or generation number associated with the file, which is 

temporally unique on every committed update to the file: 

Since updates are committed with the knowledge of the mediator DBMS in 
communication with its file/object management daemon component on the 
file server, the latest version number for the object reference associated with 
the meta-data store can be maintained both in the DBMS and in the local 
repository on the file server. Note, that such a generation number is also 
required for point in time recovery purposes. In the embodiments under 
consideration for DATALINKS, the recovery id associated with the file, 
which is derived from the LSN (Log Sequence Number) of the transaction in 
which the update was committed, may be used as version number. 

2. A last modification timestamp attribute of the file maintained natively by the file 

system (or external store): 

Whenever updates are completed, the last modification timestamp of the file 
at the time of update completion is noted in the local repository as the last 
modified time of the latest committed version of the object. 

It is more efficient if the last modification timestamp is recorded when an SQL UPDATE 
is issued for the file, rather than at the time when the transaction is finally commits. This is 
possible because in this model, no further updates to the file are allowed until the 
transaction commits. 

The approach works as follows: 

1. The DBMS mediator 724, encodes the latest version number (or generation 
number) as part of the embedded token in the handle provided for referencing the file. 

2. When the user supplies the above handle directly to the file system to open the file, 
the filter layer 752 which intercepts these file system operations makes an upcall 738 to the 
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file manager, daemon 736 to perform the process 600 shown in Fig. 5A of checking ,o 
ensure consistency at the point of open. 

m step 610 of Fig. 5A, the latest committed version number V(commit) of 
the file and the corresponding modification timestamp Tm(commit) 
recorded in the loca! repository 612 are looked up. In step 614, the version 
number encoded in the embedded token in the handle, V(token) is 
extracted. In decision block 616, a check is made to determine if V(commit) 
« equal to V(,oken). If decision block 616 returns false (NO), the token is 
mconsisten, with the current file content and an error is returned indicating 
stale data. This covers the case where the reader obtained the access 
token/handle any time prior ,o the star, of the update that was last 
committed. It also covers the situation where the reader had obtained the 
token and the metadata, and then in the meantime the database and the file 
got rolled back to an earlier state. 

If decision block 616 returns true (YES), in step 618, the current 
value of the last modification timestamp of the file Tmfaccess) is obtained 
Processing continues in decision block 620. In decision block 620, a check 
is made ,„ determine if the modification timestamp Tm(c„mmit) equals the 
t-mestamp Tm(access). If decision bock 620 returns true (YES), the file and 
meta-data are consistent and access is allowed in step 622. Otherwise if 
decision block 620 returns false (NO), there are uncommitted updates to the 
file's content, indicating an inconsistency and an error is returned in step 
624. This covers the case where an updater has modified and then closed 
the file after releasing any file locks, but has no, updated the corresponding 
meta-data. Then, optionally, if the supplied handle has write-access the 
repos.tory state is checked to rule out the case where the updates have been 
made using the same access token. (This is ,o address scenario 6b in Sec 
3.1 hereinafter). 

App.ica.ion level file locking must be used by readers and updaters to cover 
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concurrent me read and updates and, in particular, to ensure that there is no open file 
descriptor when an updater starts updating the file. 

2.2.1 Encoding the Version Number in the Token 

fo,,ows. The '° ken Prefcrably CO " tainS 3 ° ne - WayhaSh * -hich is computed as 

h = hash(K, Tx, file name, servername, token length, flags), 
where K is a secret symmetric key and Tx is the expiration time and the flags contain 
^formation about the token i ncluding which type of access is granted. Tx, token lengm 
and flags are also passed as part of the token. For cases where consistency is to be 
mamtained between the DBMS meta-data and the object, the token can be setup to contain 
a version number and a different one-way hash value, H, computed as: 

H = hash(K, Tx, file name, server name, token length, flags, Vc), 
where Vc is the latest committed version a, which me token is created and flags can 
contain information indicating that consistency is requested. 

On the file manager side 736 for token validation, the file system filter 752 passes 
*e token up to the file manager daemon 736. If the flags indicate .ha, consistency is 
requested, the daemon 736 vahdates the token hy emulating H where Vc is obtained from 
frielas, committed version number stored in the daemon's repository tables. The daemon 
736checks that the token has no, expired and compares H with the one way hash contained 
within the access token. 



25 



Steps 6,4 and 6.6 in the previous section maybe modified to just use the one-way 
hash value as the basis for checking for a version number match, instead of requiring tha, 
the version number be extracted from the token/handle. Step 6,4 could instead jus, 
retrieve the hash function portion of the token, H(token) and compute the one-way hash 
value H(c„mmit) corresponding to V(commit), white me comparison for detecting 
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staleness in step 616 may be replaced by "if H(token) * H(commit)". 

A content management system is described with reference to Figs. 2A and 2B. In 
particular, Figs. 2A and 2B are block diagrams illustrating the content management system 
5 200. The content management system includes a database 202, preferably a DB2 UDB, 
and a file system 250. The database 202 contains one or more tables 204 (only one is 
shown to simplify the drawing). Again to simplify the drawing, the table is shown with 
only two rows and two columns. However, these numbers are for purposes of illustration 
only and differing numbers of columns and rows may be practiced. Column 1 ("COL1") 
10 contains meta-data that is searchable, and column 2 ("DL1") is the DATALJNK type 
column enabled with options to provide in-place updating of a DATALINKed file, 
token-based access control on a write operation, and file data recovery. 

The table 204 may contain more than one DATALESfK column. The entry for the 
1 5 first row has " 1 000" in column 1 and URL' in the DATALTNK column DL1 . The second 
row has "1019" in column 1 and NULL in column DLL As indicated by a dotted line, 
URL' in row 1 of the table 204 refers to file 252 in the file system 250. The file 252 is 
stored externally to the database 202, and the content of the file 252 may include electronic 
documents, text, XML and HTML, digital audio and video, scanned images, facsimiles. 
20 The foregoing are merely examples of the type of the file 252. Other file types may be 
practiced without departing from the scope and spirit of the invention. 

The content management system 200 also includes a DB2 agent 220 coupled to the 
database 202, a Data Link File Manager (DLFM) 230, and a Data Link File System Filter 
25 (DLFF) 240. An application 210 is a reader in the system 200 of Fig. 2. 

The operation of the content management system 200 is described with reference to 
the flow diagrams of Figs. 3A, 3B and 3C illustrating the control process 300. The 
numbered arrows of Fig. 2 extending between elements shown in that drawing have 
30 counterpart steps shown in Fig. 3 to describe the operation of the system 200. The control 
process includes steps 260-278 and 290-296. 
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With reference to Fig. 2 A, arrow 260 extends between application 210 and DB2 
agent 220. In step 260 of Fig. 3 A, the application 210 requests a write access token from 
the DB2 agent 220 via a SQL SELECT statement. 

5 

Arrow 264 extends from DB2 agent 220 to application 210. In step 264, a write 
access token is returned. The write access token includes a hash value that encodes the 
latest version number. Arrow 266 extends between the application 210 and DLFF 240. 
The application 210 opens the file for a write operation using a write access token via the 

10 DLFF 240. Arrow 268 extends from the DLFF 240 to the DLFM 230. The DLFF 240 
forwards the write access token to the DLFM 230 for token validation. If the token is 
valid, then DLFM also checks if the file-data is consistent with the meta-data 
corresponding to the token. Arrow 270 extends from the DLFF module 240 to the 
application 210. In step 270 of Fig. 3 A, the file system returns the file descriptor to the 

15 application 210. 



Arrow 272 extends from the application 210 to the file 252 in the file system 250. 
In step 272, the application 210 modifies the file 252 using the file descriptor. Arrow 274 
extends from the application 210 to the DB2 agent module 220. In step 274, once the file 

20 update is completed, the application 210 issues a SQL UPDATE to the database DB2 202 
via the DB2 agent 220 and imports other meta-data into the database. Arrow 276 extends 
from the DB2 agent 220 to the DLFM 230 in Fig. 2A. In step 276, the DB2 agent 220 
invokes a DB2-DLFM update API and update the DLFM meta-data. At this stage, the 
DLFM determines the last modification timestamp of the file (same as the Tm(commit)) 

25 and records that timestamp in its local repository along with the updated version number 
(same as the link recovery id). 



With reference to Fig. 2B, an arrow 290 extends from the application 210 to the 
DB2 agent 220. In step 290, a SQL COMMIT is issued by the application 210 to the DB2 
30 agent 220. An arrow 292 extends between the DB2 agent 220 and the DLFM 230. In step 
292 of Fig. 3C, a PREPARE statement is sent to the DLFM 230 by the DB2 agent 220. An 
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arrow 294 extends from the DLFM 230 to the DB2 agent 220. In step 294, the DLFM 230 
hardens the DLFM meta-data and replies OK to the DB2 agent 220. Harden means 
guaranteeing that local repository modifications made by the DLFM for the file are 
remembered persistently (i.e., locally committing/hardening the information). An arrow 
296 extends between the DB2 agent 220 and the DLFM 230. In step 296, a final COMMIT 
statement (phase 2) is sen. to the DLFM 230 by the DB2 agent 220. This completes the 
control process. 



20 



30 



In the case of a read operation, the flow is as follows. The application 210 requests 
' a read access token from the DB2 agent 220 via a SQL SELECT statement A read access 
token >s returned. The read access token includes a hash value that encodes the latest 
version number. The application 2! 0 opens the file for a read operation using a read access 
token via the DLFF 240. The DLFF 240 forwards the read access token to the DLFM 230 
for token validation. If the token is valid, then DLFM also checks if the file-data is 
consistent with the meta-data corresponding to the token. This is how the DLFM performs 
the check. The DLFM looks up the latest committed version number V(comrmt) and the 
correspondmg modification timestamp Tm(commit) in its local repository (R). Then the 
DLFM determines if the version number embedded in the token, i.e. V(,oke„) matches 
V(comm,t) (preferably by computing the one-way hash value corresponding to V(commit) 
and comparing it with the hash value embedded in the token). In case of a mismatch the 
DLFM returns an error to DLFF. Otherwise, the DLFM obtains the last modification 
fmestamp of the file Tm(access) from the filesystem, and determines if Tm(access) is later 
man Tm(commit). If so, men the DLFM has detected an inconsistency and retnms an error 
to DLFF, which reports ESTALE to the appheation. If not, then the DLFM returns OK and 
DLFF returns success. The file system returns the file descriptor to the application 210 for 
a successful open. The application 2 1 0 reads the file 252 using the file descriptor. 

2.3 Extension to a Distributed File system Environment 

A further embodiment of the invention extends the above solution to a distributed 
environment 800 of Fig. 7 for accesses to the file from distributed file system clients whi.e 
mmimizmg performance overheads in communication with the central file server 880 
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where ,he file manager daemon's repository 882 is located, especially in situations where 
the fi.e system ciient 830 is likely to service accesses from the client's cache without 
connecting to the server 880. The distributed fi,e system 800 is assumed to imp.emem its 

own cache coherency mechanisms. 

A proxy daemon 890 may be set up a. each client 830 to service upcalls 838 from 
.he filter module 852, which intercepts fi,e accesses on the client 830. This proxy daemon 
890 contacts the central server 880/ repository 882 when required. 'As described 
hereinbefore, the requirement is to minimize these contacts, especially for read 
performance in the most typical cases (i.e. repeated accesses to the same file should be 
servtcable from the cache without contacting the server as far as possible). 

First Time Arrpgg 

The first time a file is accessed with a token a , the client, the cfien, proxy daemon 
890 contacts the remote cental repository 882 and makes an entry in a local (possibly 

in-memory) table with: 

• the latest committed version number, V(commit) of the file, 

• the value of the corresponding file identifier, Fid(commit) of the latest version 
of the file, and 

• the modification timestamp of the file when updates to i, were last committed, 
Tm(commit). 

The check for stale data is: 

is_stale= (V(token)*V(commit)) OR (Tm(access) Tm(commit)). 
Cach ed Case (Rep eated Arr MW ). 

Thenex, time the same fi.e is accessed (either with me same or a different token) 
•heproxy daemon 890can simply look up the proxy daemon's local table entry and if ' 
F.d(access) = Fid(commit), then the proxy daemon 890 can use the in-memory entry to 
perform the above check. 

• If is.stale is FALSE, access can continue without a need ,o contact the server 880. 
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• Otherwise, the proxy daemon 890 needs to refresh the entry from remote central 
repository 882 and perform the check again. 

Fid(commit) and Fid(access) refer to the file-id value at the time of commit and 
time of access respectively, i.e. Fid(commit) = value of Fid (Filename, Tm(commit)) 
where Fid(f, t) = value of unique file id of the file named "f ' at time "t". 

The reason for explicitly introducing these in the distributed case, is to correctly 
handle cases where the file may have been unhnked, renamed and relinked under the same 
name (as in Scenario 2). By indexing V(commit) and Tm(commit) against Fid(commit) in 
the local in-memory table, and looking up the cached entry against Fid(access). This 
ensures that the version number and Tm values refer to the same file, so that a comparison 
with Tm(access) makes sense. This is important in the distributed case, because the cached 
-value of V may be stale, making Tm(access) the real determinant of whether the entry 
should be considered stale or not. 



The unique file id maintained by a distributed filesystem usually includes some 
kind of per inode generation number of its own to account for reuse of inode numbers after 
files are deleted. This inode generation number is typically incremented every time an 
20 inode number is reassigned to a newly created file. This ensures the temporal uniqueness 
of a file identifier. 

Just as in the local case, if Tm(access) > Tm(commit) and the token is a write 
token, an optional step of checking the file if an update-i„-progress with the same 
token/user may be involved (e.g. by checking state information in the repository) 



25 



This approach causes the client to contact the central repository on the server only 
if the file changed or got relinked since the last time the entry was looked up. The 
in-memory table entries can be flushed/reused based on the expiration time and resource 
30 limits. 
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F.g. 5B is a flow diagram illustrating the process of checking consistency of 
meta-data and file data in a distributed file system in accordance with .he second 
embodiment, In step 656, a check is made to determine if the present case represents a 
"firs, „me» access of the file. If a firs, time access is being made, processing starts a, step 
660. Otherwise, step 660 is skipped and processing starts at step 664. In step 660, the las, 
commuted version number, V(comm,t) and the timestamp Tm(commif> are looked up from 
me reposnory (i.e., remote repository 662 corresponding to repository 882 in Fig 7) and 
the variab.e REFRESHED is se, ,o tiue. When s,ep 660 is skipped, cached values of 
V(commif) and Tm(commit) for the file are used instead in me subsequent s,e P s 
Processing continues a, step 664, in which the version number V(token) is extracted from 
■he token. In step 666, a check is made to determine if V(commif> is equa, ,„ V(,oken) If 
dec,s, 0 „ block 666 returns false (NO), processing continues a, decision block 672 
Otherw,se, if decision block 666 returns true (YES), processing continues a. step 668. 

In Step 668, the current value of the timestamp Tm(access) is obtained from the file 
in decision block 670, a check is made ,o determine if the timestamp Tm(commit) is equa, ' 
to Tm(access). If decision block 670 returns fa,se (NO), processing continues a, decision 
block 672. If decision block 670 returns true (YES), the file and meta-da,a are consistent 
the vanable refreshed is se, to fa.se, and access is allowed to the object in step 622. 

In decision block 672, a check is made ,o determine if to cache was accessed (i e 
REFRESHED equals false). If decision block 672 returns true (YES), processing 
continues at block 660. Finally, if decision block 672 returns fa,se (NO), the meta-data and 
file data are mconsistent and an error is returned in step 624. 



i Scenarios Analyzed 

This section discusses various key scenarios and how the embodiments handle each 
of these situations to prevent potential inconsistencies from arising. 
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Initial Assumption : 

It is sufficient to guarantee the consistency at the point of the file/object open. The 
user is expected to use some form of file/application-level locking as in a normal file 
system to ensure serialization between concurrent readers and updaters. Specifically, an 
updater must not start its updates if there is any reader that has opened the file. Otherwise, 
the reader could see inconsistent data. Table 1 is a legend of terminology used hereinafter. 

TABLE 1 



Term Meaning 

Qr(f, t) user does a SQL query and an access token for a file f is 
assigned at time t. 

Qw(f, t) user does the SQL query for a write token and the write token 
for the file f is assigned at time t. 
C(f, t) Content of the file f at time t. 

Rs(f, t) time when the reader opens the file f with access token for 
generated at t. 

Re(f, t) time when the reader closes the file f (finished reading the file) 

corresponding to Rs(f, t). 
Us(f, t) time when the writer opens the file f for update using the write 

token generated at t. 
Ue(f, t) time when the writer closes the file f (i.e. finishes writing the 
file) corresponding to Us(f, t), but not yet committed to the 
database (DB) using SQL update. 
Uc(f, t) time when meta-data in DB is updated and committed together 
with the file-data updates for the file f. 
1(f) time when the file f is linked (inserted into) to the database. 
D(f) time when the file f is unlinked (deleted) from the database 

(restore on unlink). 
Us(f) time when the writer opens the file for an update without a 
token. 
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Ue(f) time when the writer closes the file (finishes wnting the file) 
corresponding to Us(f). 
M(f , f) time when the file f is moved to f (could be rm f, mv f f) 
results in f getting replaced by f - both data + attributes 
(including timestamp) C(f, ta) <= C (f , tb), where ta < M(f f) < 
tb. J 



5 



— Add ressinp Potential Tnrnn«icf^ cv Scenic 

The a ppllca t lon is responsible for ensu . ng semiization) go ^ ^ 

happensunderasharedfilelockandUs-Uehappensunderanexclus.vefi^^ 
Howev^multipleUs.Ue^ 

and an Rs-Re can happen between two such Us-Ue sequences. 



1. 



Query - Unlink - Update - Relink - Access 

A file is unlinked and linked back again after mortfying the content of the 
file (not in-place update). The file is then accessed usmg . token generated 
prior to the unlink. 

11(f); Qr(f, t); DI(f); Us (f); U(f); 12(f); Rs(f >t ) 

File data consistent with W , ,) is C (f, „ (f)) or C(f; ,, C(f , Rs(f>t)) 
C(f, t) smce the file has been written to in between. 

With the embodiments of the invention, Rs (f,t) fa i,s since V(,oken) < 
V(commit), i.e., the version-id embedded in the read-token is earlier man the 
one in 12(f). I fI2 (f) had not ^ ^ ^ ^ ^ ^ ^ 

not DB and the read token is embedded in the file pathname 



2- Query- Unlink - MoveTo - Relink -Access 

The f„e is unlinked and linked back again after renaming another file to mis fi,e 
name. The file ts men accessed using a token generated prior ,o the unlink (that token w, 
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meant to represent the original file, not the 



just renamed one). 
IK*); Qr(f,t); Dl(f); M(f,f); 12(f); Rs(f5t) 



File data consent wxth Qr(f, t) is C(f, 11(f)) or C(f; t) c 
M(f,f), which differs from C(f, t) The file has ch . ( 
between ged t0 * different one * 



20 



25 



then UP ^ m PlaCe ^ e com P lete SMOM sequence) and committed The file is 
<Wt .1); Qw(f, Q); Us(f> ^ Ue(f _ Q}; Uc(f _ G); ^ u) 

U) as the flIe intent has been updated. 

l - Update-file in place - Ouerv tt a . 

P dee guery - SQL Update commit - Access 

The file is updated in place and committed. The file is then „ 
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was issued ,„ compiete «he copending meta-data updates. 

m Qw« „); Us(( Ue(f , a) , Wf , Q) . Uctt ^ e) 
The file data consistent with Off t2^ rvf ™ ^ 
..)as Ul efi,eco„ tenthasbe e nupdatedbetween y tland h ; W ^ 

Qw(f, „>; Us(f , tl); ^ , 2); ^ (2); Uc(f> ^ Q) W 

5- Ge, Write tokens^ " Updated in place with token , . 
Commit -Access with token 2 

^efiie is updated inpiace and con^itted using onewrite token. The meisthen 

^^"^^^^'■^^^^ctt,,,^^^^^ 
^c,,a consistent wi ftQw(f , t2)isC(f>W)orC(f> 

bUt,haUh ' ShaPP — ""-ritetokenins.eadofa.adtoke, ' 
Wimtneemoodimentsofthe^ 

30 6 : a) *^ to, -^*-2-^ taPh , WBllltal 
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- Access with token 2 - SQL Update Commit 



The ffleis updated inpiace using one write token, bu, not y e, committed. The fiie 



F«e data consistent with Q r(f , Q)/Qw(t t2) ls C(f> 

^o m C (f ,,2 ) ,nce th e fi ,econ,en t ha sb eenupda t ed( by , h es a leu S !; 
With the embodiments of the invention, Rs (f.,2) fails since Tm > 
.lo W e d an d theupda,e s ,obevisi bl e,o, h e re ade, This is accomphshed b y 

up at u„ W commit. This makes itpossibie to perform the above timestamp cheek 
on, y ,f the reader is no, the same userid as the current updater. 

Aneed exists, be abie to differentiate the above from the flowing situation, 
whtch is not considered as an inconsistency: 

TO; Qw(f, us ( f, Ue(f , tI); Rs(e „ )/Us(f> tl) . Uc(f> 

As the sametoken that is used for the firs, update is used again, more appropriate!, 
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■he token returns the ,ates, state of the ffle data as written using earher with this 
token. 



20 



Two options are avai,ab,e to handle the case of a second write-token for the same 
user. Firstly, write access can be denied to even the same user with different token 
Chan what ts already tracked). Secondly liberaHy a„ow update based on the satne 
user-id. If uncommitted updates are decided to be made and alternatively visible 
and modifiab.e on accesses h y the same user, this situation is taken care of using a 
smnlar logic as described under the previous point. If th e other option is decided 
upon, since the wri.e-token is tracked until the update is committed, when a 
different token is used can be detected. In me distributed file system extension, this 
check happens a, the end after ali other conditions are covered, as mteraction with 
the file server is required. 



There is an implicit assumption here that other users are no, allowed to use the 
wnte-token as long as an update is in progress. 

* Get write token 1 - Update file in p,ace - Get read token 2 - DB rollback to earlier 
version - Access with read token 2 

Adatahase rollback to „ eariier version happens, and the H,e is accessed with a token , ha, 
longer valid after, he rollback) ucnisno 



25 I(f) ;Q w(f, „);Us(f, „); Uetf a y. Uc(f> „ ); ^ Q); M({ ^ 



t2) 



F.le data consistent with Qr(f, t2) is C(f , t2) ,, Cffi Uc(f _ ^ 
content is C(f, tl) which is the same as C(f, 1(f)). 

30 With the solution proposed, Rs(f, , 2 ) fai ,s since in mis case V(,oke n) > V ( commit) i e the 
vers.on-.d embedded in ,he read-,oken is different from the one in 1(f). 
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In the distributed filesystem case, where ,he client's in memory tab.e has a stale value of 
V(comm„) and T m (commit), the condition V(,oken) _ V(commi,) could succeed and 
thus may no. suffice to detect s,a,e inflation (since the client may have cached the 
vers™ ,d corresponding to ,2, i.e. Uc(f ,1) and no, ye, be aware of ,he rollback ,„ „) 
> However in ,ha, situation, me check Tm(access) - T m( commi,) would faii (because me 
cached value of Tm(commi,) is s ,i„ ,2 so far, while me rollback would have reverted the 
file's actual modtfication timestamp back to ,1), and trigger a refresh of the values of 
V(commit) and Tm(commit) from the server. 

As the foregoing embodiments of the invention illustrate, a ioose transaction model 
for updates to a file and its corresponding meta-data through a mediator is useful for 
dtrecly performing in-place edits of content data residing on stores externa, to the indexed 
meta-data store (the latter cou.d be a DBMS). This is subject to the requirement of 
ensunng consistency between the file content and the associated meta-data from a reader's 
perspective. The embodiments of the invention encode a version number in the handle for 
referencing the object associated with a given meta-data state. A thin interceptor layer on 
the native store, where the objects content are stored, decodes this version number and 
compares the decoded version number with the version number of the latest comm.tted 
vers.onofmenle. This is done to determine if the handle refers to the current vers.on If 
the version matches, the thin interceptor layer checks for uncommitted updates by 
comparing the las, modification time s,amp of me f,,e wi,h ,he las, modification times.amp 
for me la.es, committed version. ,f mese ma,ch, me min interceptor layer allows access to 
proceed as usua. for the file. Otherwise, the min interceptor layer reports an error 
indtcating that the handle refers to stale data. 

The embodiments of the invention are advantageous in a number of ways The 
embodiments enable in-place updates of file content ,„ be made direcly on the file system 
whtch ,s separate from the meta-data store. This is done while ensunng mat a reader ' 
apphcation does no, see an inconsistency between the meta-data and the file data Another 
advantage is tha, the consistency check on fi.e access can be performed without contacting 
the meta-data server, which is separate from the meta-data store. This results in fas. 
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content access times even when the file and meta-data servers are distributed. 

Advantageously, this approach is suitable for a distributed model for file and 
meta-data storage, since the file content access path checks do not require any direct 
5 communication with the meta-data server. This approach extends easily to a configuration 
where the external store is a distributed file system and the content is accessed directly 
from distributed file system clients. This works even in the presence of authoritative 
caching (as in DCE-DFS) with minimal communication overheads and does so using an 
entirely client initiated approach, without any client specific state being required to be 
10 maintained on the servers. The approach may be enhanced to work correctly (in terms of 
preventing accesses to potentially inconsistent data) with mobile file systems (e.g., Coda 
or Intermezzo) that operate in disconnected mode. The consistency check makes efficient 
use of caching for improved performance without losing correctness. 

1 5 Further, the method described does not require clock synchronization between the 

database server and the filesystem server, and even with the filesystem client in the 
distributed filesystem case. Note that even in the scenario where the clocks are 
synchronized, it is not efficient for the DB server to save the time when the latest update 
was committed, Tl(Commit) (instead of V(commit)), and provide it to the mediator 

20 daemon at transaction commit to save the same (Tl(commit) value) for the corresponding 
files in the local repository on the file server. The reason is that this would require the 
database server to track all the rows involved in the linkfile operation for a particular 
transaction, update them prior to doing its commit processing with the T(Commit) value, 
and the same would have to be done at the file management daemon's end. 

25 

The solution is capable of covering cases where the database has been rolled back 
to an earlier state, as might happen in the case of a point-in-time restore, and should also 
work with database replication, as long as the legitimacy of the last modified timestamp of 
the file is maintained during a restore and across replicas. 

30 

The embodiments of the invention are preferably implemented using one or more 
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with an appiication or reader „ ff , 1 T for use 

w uon or reader, are effected by instructions in the software Th» «, 

w^otnerco.pon:::^ alS ° beaP «^— arenn.t.mse 

computer program product inHn^ Y he com P ut «-. A 

a compute p^ ^'^»-*-» tohj ^ loaw ,„ 
-puterpr^l;™; " beCarried0 '"''^— r-Theuseofthe 

offi.eandmeta.data updates ^^^^ b '^«»^mo (W 

20 devices ^^r^^^"^'^^^ 
connected to the c^T^ ^^^^"-^^ 
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workstation or the like The fnreo • Sparestatton (TM), a 

embodies, are res ^T^T ^^V 0 ""*** 

-puterreadabien^ 

-orageoffteprogrTr:? , ' ^ P~ Intend 
5 dove. g eSem ' COndUCt0rmem ^P-iW y i nconcertwiththehardd . k 

« .ape, a^M ^ 
Mission h^TbeZr ' 0 ™^ 

relevan, contputer readaMe mediums ^ 8 " ° f 
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